6 research outputs found

    When can the two-armed bandit algorithm be trusted?

    We investigate the asymptotic behavior of one version of the so-called two-armed bandit algorithm. It is an example of a stochastic approximation procedure whose associated ODE has both a repulsive and an attractive equilibrium, at which the procedure is noiseless. We show that if the gain parameter is constant or goes to 0 not too fast, the algorithm does fall in the noiseless repulsive equilibrium with positive probability, whereas it always converges to its natural attractive target when the gain parameter goes to zero at some appropriate rates depending on the parameters of the model. We also elucidate the behavior of the constant step algorithm when the step goes to 0. Finally, we highlight the connection between the algorithm and the Polya urn. An application to asset allocation is briefly described.

    Generalized Urn Models of Evolutionary Processes

    Generalized Polya urn models can describe the dynamics of finite populations of interacting genotypes. Three basic questions these models can address are: Under what conditions does a population exhibit growth? On the event of growth, at what rate does the population increase? What is the long-term behavior of the distribution of genotypes? To address these questions, we associate a mean limit ordinary differential equation (ODE) with the urn model. Previously, it has been shown that on the event of population growth, the limiting distribution of genotypes is a connected internally chain recurrent set for the mean limit ODE. To determine when growth and convergence occur with positive probability, we prove two results. First, if the mean limit ODE has an "attainable" attractor at which growth is expected, then growth and convergence toward this attractor occur with positive probability. Second, the population distribution almost surely does not converge to sets where growth is not expected.

    Edge-reinforced random walk, Vertex-Reinforced Jump Process and the supersymmetric hyperbolic sigma model

    Edge-reinforced random walk (ERRW), introduced by Coppersmith and Diaconis in 1986, is a random process which takes values in the vertex set of a graph G and is more likely to cross edges it has visited before. We show that it can be represented in terms of a vertex-reinforced jump process (VRJP) with independent gamma conductances: the VRJP was conceived by Werner and first studied by Davis and Volkov (2002, 2004), and is a continuous-time process favouring sites with more local time. We calculate, for any finite graph G, the limiting measure of the centred occupation time measure of VRJP, and interpret it as a supersymmetric hyperbolic sigma model in quantum field theory, introduced by Zirnbauer (1991). This enables us to deduce that VRJP and ERRW are positive recurrent in any dimension for large reinforcement, and that VRJP is transient in dimension greater than or equal to 3 for small reinforcement, using results of Disertori and Spencer (2010) and Disertori, Spencer and Zirnbauer (2010).

    Prospective individual patient data meta-analysis of two randomized trials on convalescent plasma for COVID-19 outpatients

    Data on convalescent plasma (CP) treatment in COVID-19 outpatients are scarce. We aimed to assess whether CP administered during the first week of symptoms reduced the disease progression or risk of hospitalization of outpatients. Two multicenter, double-blind randomized trials (NCT04621123, NCT04589949) were merged with data pooling starting when = 50 years and symptomatic for <= 7 days were included. The intervention consisted of 200-300 mL of CP with a predefined minimum level of antibodies. Primary endpoints were a 5-point disease severity scale and a composite of hospitalization or death by 28 days. Amongst the 797 patients included, 390 received CP and 392 placebo; they had a median age of 58 years, 1 comorbidity, 5 days of symptoms, and 93% had a negative IgG antibody test. Seventy-four patients were hospitalized, 6 required mechanical ventilation and 3 died. The odds ratio (OR) of CP for improved disease severity scale was 0.936 (credible interval (CI) 0.667-1.311); the OR for hospitalization or death was 0.919 (CI 0.592-1.416). The CP effect on hospital admission or death was largest in patients with <= 5 days of symptoms (OR 0.658, 95% CI 0.394-1.085). CP did not decrease the time to full symptom resolution.

    Pièges des algorithmes stochastiques et marches aléatoires renforcées par sommets (Traps of stochastic algorithms and vertex-reinforced random walks)

    CACHAN-ENS (940162301) / Sudoc, France

    When can the two-armed bandit algorithm be trusted?

    SIGLE. Available from INIST (FR), Document Supply Service, under shelf number 22522, issue a.2002 n.14 / INIST-CNRS - Institut de l'Information Scientifique et Technique, France